.TL
numerics entry test
.AU
strlst
.SH
attempt 4
.NH
.EQ
define repr `~sub { ~ $3 } ($1) sub { $2 ~ }`
define binary `repr($1, 2)`
define bin    `repr($1, 2)`
define is `~~=~~`
define corresponds `~{~= hat}~~`
delim $$
.EN
.PP
Given the binary number $ bin(010011110) $, interpret it as $ repr(x, 2, fp) $ (where $ fp $ denotes fixed point with four fractional bits) and calculate its decimal value.
.EQ
bin(010011110) mark corresponds repr(010011110, 2, fp) is bin(1001.1110)
.EN
.EQ
lineup is ( 8 + 1 + 2 sup -1 + 2 sup -2 + 2 sup -3 )
.EN
.EQ
lineup is ( 9 + 1 smallover 2 + 1 smallover 4 + 1 smallover 8 )
.EN
.EQ
lineup is ( 9 + { 4 + 2 + 1 } smallover 8 ) is ( 9 + 7 smallover 8 ) is repr(9.875, 10)
.EN
.EQ
7 times 1 smallover 8 is 7 times 0.125 is 0.5 + 3 times 0.125 is 0.875
.EN
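.PP
The conversion above can be cross-checked with a few lines of Python. This is only a sketch: the helper name fixed_point_value and the choice of four fractional bits (read off the worked example) are assumptions of this illustration, not part of the task statement.
.LP
.ft CW
.EX
```python
from fractions import Fraction


def fixed_point_value(bits: str, frac_bits: int) -> Fraction:
    """Interpret an unsigned bit string as fixed point.

    frac_bits is the number of digits after the binary point
    (an assumption of this sketch: 4 for the example above).
    """
    return Fraction(int(bits, 2), 2 ** frac_bits)


value = fixed_point_value("010011110", 4)
print(value, float(value))  # prints: 79/8 9.875
```
.EE
.ft
.PP
Using Fraction keeps the intermediate result exact, which mirrors the step from 9 + 7/8 to 9.875.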
.NH
.PP
Add the following encoded binary numbers $ A = bin(0001011010011101), B = bin(0010011001111111) $ represented by the system $ F(2,11,-14,15,true) $ (IEEE 754-2008 with half precision) using round to nearest, ties to even, as the rounding scheme. The resulting number should be encoded in the same format.
.PP
The task at hand can be represented as follows:
.LP
.ft CW
.EX
  vz | e   | m        |g|r|s
  0   00101 1010011101
 +0   01001 1001111111

    e
   01001
  -00101
 = 00100
.EE
.ft
.PP
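Because the exponent of $ A $ is smaller than that of $ B $, we raise the exponent of $ A $ to that of $ B $. Since this adds $ e sub B - e sub A = bin(00100) $ to $ e sub A $, we shift the significand of $ A $ (including its hidden leading 1) right by $ bin(00100) = repr(4, 10) $ digits. The bits shifted out fill the guard, round and sticky positions, and the sticky bit is the OR of everything beyond the round bit: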
.LP
.ft CW
.EX
   vz | e   | m        |g|r|s
   0   00101 1010011101
  +0   01001 1001111111
 = 0   01001 0001101001 1 1 0 1
  +0   01001 1001111111
 = 0   01001 0001101001 1 1 1
  +0   01001 1001111111
 = 0   01001 1011101000 1 1 1
 = 0   01001 1011101001
.EE
.ft
.PP
The result is $ repr(0010011011101001, 2) $
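.PP
The alignment and rounding steps above can be sketched in Python with plain integers. The function names split_half and add_same_sign are hypothetical, and the sketch assumes two normalized operands of equal sign; subnormals, infinities, NaN and exponent overflow are deliberately left out.
.LP
.ft CW
.EX
```python
def split_half(bits: str):
    """Split a 16-bit string into sign, biased exponent, 11-bit significand.

    Assumes a normalized number, so the hidden leading 1 is restored.
    """
    sign = int(bits[0], 2)
    exponent = int(bits[1:6], 2)
    significand = (1 << 10) | int(bits[6:], 2)
    return sign, exponent, significand


def add_same_sign(a_bits: str, b_bits: str) -> str:
    """Add two normalized binary16 values of equal sign (ties to even)."""
    sign, ea, ma = split_half(a_bits)
    _, eb, mb = split_half(b_bits)
    if ea > eb:  # make b the operand with the larger exponent
        ea, ma, eb, mb = eb, mb, ea, ma
    shift = eb - ea
    # Append three extra bits (guard, round, sticky) to both significands.
    ma_ext, mb_ext = ma << 3, mb << 3
    sticky = 1 if ma_ext & ((1 << shift) - 1) else 0
    ma_ext = (ma_ext >> shift) | sticky
    total = ma_ext + mb_ext
    exponent = eb
    if total >> 14:  # carry out: renormalize, fold the lost bit into sticky
        total = (total >> 1) | (total & 1)
        exponent += 1
    grs = total & 0b111
    mantissa = total >> 3
    if grs > 0b100 or (grs == 0b100 and mantissa & 1):
        mantissa += 1
        if mantissa >> 11:  # rounding carried into a new leading digit
            mantissa >>= 1
            exponent += 1
    return f"{sign:b}{exponent:05b}{mantissa & 0x3FF:010b}"


print(add_same_sign("0001011010011101", "0010011001111111"))
# prints: 0010011011101001
```
.EE
.ft
.PP
For the two operands of this task the guard, round and sticky bits come out as 1 1 1, so the significand is rounded up, as in the table above.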

.NH
.PP
Subtract the following encoded binary number $ bin(1000101001110111) $ from the following encoded binary number $ bin(1110101011110100) $, both numbers being represented using the system $ F(2,11,-14,15,true) $ (IEEE 754-2008 with half precision) using round to nearest, ties to even, as the rounding scheme. The resulting number should be encoded in the same format.
.PP
The task at hand can be represented as follows:
.LP
.ft CW
.EX
  vz | e   | m        |g|r|s
  1   11010 1011110100
 -1   00010 1001110111

    e
   11010
  -00010
 = 11000
.EE
.ft
.PP
Both operands are negative. Writing $ A $ for the magnitude of the minuend and $ B $ for the magnitude of the subtrahend, we can reformulate our task:
.EQ
(-A) - (-B) is (-A) + B is B - A
.EN
.PP
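Additionally, as for addition, we inspect the absolute exponent distance. Because the magnitude of $ A $ is larger than that of $ B $, the result $ B - A $ is negative, and we shift the significand of $ B $ right by $ e sub A - e sub B = bin(11000) = repr(24, 10) $ digits, which moves all of its bits past the round position and into the sticky bit.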
.LP
.ft CW
.EX
   vz | e   | m        |g|r|s
   1   11010 1011110100
  -1   00010 1001110111
 = 0   00010 1001110111
  -1   11010 1011110100
 = 0   11010 0000000000 0 0 0 000000000011001110111
  +0   11010 1011110100
  +1
 = 0   11010 0000000000 0 0 1
  +0   11010 1011110100
  +1
 = 1   11010 1011110100
.EE
.ft
.PP
The result is $ repr(1110101011110100, 2) $
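.PP
That the subtrahend vanishes entirely in the rounding can be checked with Python's standard struct module, whose "e" format code packs and unpacks IEEE 754 binary16 values. The helper names decode_half and encode_half are hypothetical. The sketch relies on the exact difference fitting into a double, so the only rounding happens when packing back to half precision.
.LP
.ft CW
.EX
```python
import struct


def decode_half(bits: str) -> float:
    """Decode a 16-character bit string as an IEEE 754 binary16 value."""
    return struct.unpack(">e", int(bits, 2).to_bytes(2, "big"))[0]


def encode_half(value: float) -> str:
    """Round a float to binary16 via struct and return its 16 bits."""
    return format(int.from_bytes(struct.pack(">e", value), "big"), "016b")


minuend = decode_half("1110101011110100")     # -3560.0
subtrahend = decode_half("1000101001110111")  # tiny negative number
print(minuend)
print(encode_half(minuend - subtrahend))      # same bits as the minuend
```
.EE
.ft
.PP
Near 3560 the spacing between neighbouring half precision numbers is 2, so a change of roughly 0.0002 cannot move the result to a different encoding.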

.NH
.PP
Multiply the following encoded binary numbers $ A = bin(0110010110101100), B = bin(1001100000100001) $ represented by the system $ F(2,11,-14,15,true) $ (IEEE 754-2008 with half precision) using round to nearest, ties to even, as the rounding scheme. The resulting number should be encoded in the same format.
.PP
The task at hand can be represented as follows:
.LP
.ft CW
.EX
  vz | e   | m        |g|r|s
  0   11001 0110101100
 *1   00110 0000100001
.EE
.ft
.PP
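We XOR the sign bits, add the exponents and multiply the significands (the mantissas with their hidden leading 1). Adding the biased exponents would not overflow the 5-bit field here, but since both of them carry the bias $ e = repr(15, 10) = bin(01111) $, we must subtract it once regardless.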
.EQ
e sub { common } is e sub B  - e + e sub A
.EN
.LP
.ft CW
.EX
   11001
  -01111
  +00110 
 = 01010
  +00110 
 = 10000
.EE
.ft
.PP
Thus our common exponent is $ bin(10000) $. Our sign bit is $ 0 ~ roman XOR ~ 1 = 1 $.
.LP
.ft CW
.EX
 (1) 0110 1011 00 * (1) 0000 1000 01

= 1  0110 1011 00
 +   0000 0000 00 0
 +    000 0000 00 00
 +     00 0000 00 000
 +      0 0000 00 0000
 +        1011 01 0110 0
 +         000 00 0000 00
 +          00 00 0000 000
 +           0 00 0000 0000
 +             00 0000 0000 0
 +              1 0110 1011 00

= 1  0111 0110 10 1100 1011 00
.EE
.ft
.PP
Taking our results thus far, we can finalize our calculations:
.LP
.ft CW
.EX
   vz | e   | m        |g|r|s
   0   11001 0110101100
  *1   00110 0000100001
 = 1   10000 0111011010 1 1 0 0101100
 = 1   10000 0111011010 1 1 1
 = 1   10000 0111011011
.EE
.ft
.PP
The result is $ bin(1100000111011011) $
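.PP
The three steps of the multiplication (XOR of the signs, exponent sum minus bias, product of the significands followed by rounding) can be sketched in Python as follows. The function name multiply_half is hypothetical, and the sketch assumes normalized operands whose product stays in the normalized range; overflow, underflow and special values are not handled.
.LP
.ft CW
.EX
```python
def multiply_half(a_bits: str, b_bits: str, bias: int = 15) -> str:
    """Multiply two normalized binary16 values (round to nearest, ties to even)."""
    sign = int(a_bits[0]) ^ int(b_bits[0])
    exponent = int(a_bits[1:6], 2) + int(b_bits[1:6], 2) - bias
    ma = (1 << 10) | int(a_bits[6:], 2)  # restore the hidden bit
    mb = (1 << 10) | int(b_bits[6:], 2)
    product = ma * mb  # 21 or 22 bits with 20 fractional bits
    if product >> 21:  # product in [2, 4): renormalize
        exponent += 1
        dropped = 11
    else:
        dropped = 10
    mantissa = product >> dropped              # 11-bit significand
    remainder = product & ((1 << dropped) - 1)  # bits below the mantissa
    half = 1 << (dropped - 1)
    if remainder > half or (remainder == half and mantissa & 1):
        mantissa += 1
        if mantissa >> 11:  # rounding carried into a new leading digit
            mantissa >>= 1
            exponent += 1
    return f"{sign:b}{exponent:05b}{mantissa & 0x3FF:010b}"


print(multiply_half("0110010110101100", "1001100000100001"))
# prints: 1100000111011011
```
.EE
.ft
.PP
Comparing the dropped bits against one half of the last kept digit is equivalent to inspecting guard, round and sticky: here the dropped bits are 1100101100, which is more than one half, so the mantissa is rounded up.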
